Method for Capturing, Measuring and Analyzing Motion

ABSTRACT

A system and method for capturing, measuring and analyzing motion in real-time is provided which operates with or without markers. A threshold image is based on a background image and is created to minimize noise during the motion capture. A captured image is compared to the threshold image to identify collections of hot pixels or globs. The globs are compared to expected characteristics of the markers or the subject and are tracked between frames. Glob information is used to determine the location of the markers or subject in each frame.

RELATED APPLICATION

This application is a continuation of U.S. patent application Ser. No. 11/011,540 entitled “Method for Capturing, Measuring and Analyzing Motion” filed Dec. 13, 2004, which claims the benefit of priority to U.S. Provisional Application No. 60/528,666 entitled “System and Method for Motion Capture,” U.S. Provisional Application No. 60/528,880 entitled “System and Method for Measuring, Animating and Analyzing Motion,” and U.S. Provisional Application No. 60/528,735 entitled “Camera for Measuring, Animating and Analyzing Motion,” all of which were filed Dec. 11, 2003. The complete disclosure of each of the above-identified applications is hereby fully incorporated herein by reference.

TECHNICAL FIELD

The present invention is directed in general to providing a system and method for capturing and analyzing motion, and in particular to capturing motion using a background image and a threshold image.

BACKGROUND

Motion capture systems provide the ability to measure and analyze the motion of humans, animals, and mechanical devices. Once the motion is captured, it can be used for a variety of purposes. For example, the captured motion can be used to animate a computer-generated model of the subject so that the subject's motion can be analyzed or the motion can be used to animate a character created for a motion picture.

Ideally, a motion capture system captures the subject's motion without interfering with the subject's motion, analyzes the captured motion in real-time, and provides an accurate representation of the motion. However, current systems do not provide the real-time performance and accuracy demanded by many applications. Several currently available motion capture systems place markers on or near a subject's joints and then use small groups of markers to determine the position and orientation of the subject. One disadvantage of these types of systems is that the limitation on the position and number of the markers leads to accuracy problems. Another disadvantage is that the markers can interfere with the subject's motion.

Some currently available systems can provide accurate motion capture, but not in real-time. Those systems that provide real-time performance typically sacrifice accuracy or limit the type or scope of motion that can be analyzed. Thus, there is a need for a motion capture system that can accurately capture motion in real-time.

SUMMARY OF THE INVENTION

The present invention provides a method for capturing, measuring and analyzing the motion of humans, animals, and mechanical devices in real-time. A background image is created that corresponds to the motion capture environment. The background image is used to create a threshold image. In one aspect of the invention, the threshold image is created using an auto thresholding feature. In the auto thresholding feature, each camera captures a series of images while the subject makes movements similar to the ones to be captured. This series of images is used to adjust the various threshold parameters, which are used to generate the threshold image for each camera. First, the center of the image for curvature falloff is determined. Once the center of the image is determined, the threshold intensity is lowered until the noise is too high. Noise includes hot spots and globs that are not associated with markers. The threshold intensity is then incremented until the noise is limited. After the threshold intensity is adjusted, the curvature is lowered until the peripheral noise is too high. The threshold intensity and the curvature are then incremented until the noise is limited across the field of view.

During motion capture, a captured image is compared to the threshold image on a pixel-by-pixel basis to locate hot pixels. A hot pixel is a pixel in the captured image that has an intensity greater than the corresponding pixel in the threshold image. Once the hot pixels are located, the pixels are analyzed to locate connected hot pixels (segments) and connected segments (globs). If the characteristics of the globs satisfy the characteristics of the markers (or the subject in a markerless capture), then the globs are selected for further analysis. The 3D locations for the candidate points corresponding to the selected globs are determined and are used to track the positions of the candidate points between frames. The track attributes for the candidate points are compared to the expected attributes of the subject's motion and if there is a correlation, then the candidate points are used to determine the subject's motion.

These and other aspects, features and advantages of the present invention may be more clearly understood and appreciated from a review of the following detailed description of the disclosed embodiments and by reference to the appended drawings and claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow diagram illustrating a method for auto thresholding in accordance with an embodiment of the invention.

FIG. 2 is a flow diagram illustrating in more detail a portion of the auto thresholding method of FIG. 1, in accordance with an embodiment of the invention.

FIG. 3 is a flow diagram illustrating in more detail another portion of the auto thresholding method of FIG. 1, in accordance with an embodiment of the invention.

FIG. 4 is a flow diagram illustrating in more detail another portion of the auto thresholding method of FIG. 1, in accordance with an embodiment of the invention.

FIG. 5 is a flow diagram illustrating in more detail another portion of the auto thresholding method of FIG. 1, in accordance with an embodiment of the invention.

FIG. 6 is a flow diagram illustrating in more detail another portion of the auto thresholding method of FIG. 1, in accordance with an embodiment of the invention.

FIG. 7 is a flow diagram illustrating a method for motion capture in accordance with an embodiment of the invention.

DETAILED DESCRIPTION

The present invention provides a method for capturing, measuring and analyzing the motion of humans, animals, and mechanical devices in real-time. Briefly described, the present invention uses cameras to capture the movement of a subject. In one embodiment, markers are placed on the subject, while in another embodiment markers are not used. Images captured by the cameras are compared to a threshold image to identify collections of hot pixels or globs. The globs are compared to expected characteristics of the markers or the subject and are tracked between frames. Glob information is used to determine the location of the markers or subject in each frame.

The Motion Capture Environment

The system uses a number of high-speed cameras to capture information about the locations of the markers associated with the subject (or the location of the subject in a markerless embodiment) as the subject moves. The cameras support high-speed image capture, as well as high-speed image processing. The cameras are connected to each other, as well as to a central computer.

The cameras are synchronized so that their shutters open simultaneously. The shutter open time is variable and typically ranges from 1/1000 to 1/4000 of a second depending upon the speed of the motion to be captured. The shutters can be triggered by a signal from the central computer or can be triggered using a synchronized clock signal within each camera. The frame rate is based on the motion to be captured and can be constant throughout the motion capture or can vary. For example, if a golf swing is being captured, the frame rate may be higher around the point of impact. A frame rate of 2000 frames per second could be used for the 10 frames before and the 10 frames after the club impacts the ball, and a frame rate of 200 frames per second could be used for the remaining frames.

A spotlight is attached to each camera and is aligned with the camera's line of sight so that the highly reflective material used for the markers appears very bright in the camera image. The images from the cameras are digitized and the brightness of each pixel is determined in order to identify bright regions in the image. The locations of the bright regions, as well as other characteristics of the regions, are used to determine the locations of the markers.

Typically, the cameras are placed around the room. Each camera determines 2D coordinates for each marker that it sees. The coordinate information for each marker from the cameras is calibrated so that the 2D coordinates are transformed into 3D coordinates. The cameras can be calibrated by moving a single marker throughout the motion capture area. Alternatively, the cameras can be calibrated by moving multiple wands having a small number of markers throughout the motion capture area. The fixed relative positions of the markers on the wand are used by the calibration process to quickly calibrate the cameras. As the subject moves, the cameras capture the motion and provide marker data, which describes the location of the markers in the 3D space.

Background Image

A background image represents the motion capture environment and is calculated for each camera. The background image includes items that will be present throughout the motion capture, such as the other cameras and lights, and excludes the subject and any other objects that will be the subject of the motion capture. A number of images taken over a period of time are used to create the background image to accommodate fluctuations in the background. The majority of the images are acquired using the same shutter speed as will be used for the motion capture. The background image represents the maximum intensity of the acquired images. For a pixel having coordinates (x, y), the background image is defined as shown below.

BackGround(x,y)=max {Im1(x,y), Im2(x,y), . . . , ImX(x,y)}
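By way of illustration only, the per-pixel maximum defined above can be sketched in Python. The function name background_image and the representation of an image as a list of rows of integer intensities are assumptions made for this sketch and are not part of the disclosed embodiments.

```python
def background_image(images):
    """Return the per-pixel maximum over a series of equally sized images.

    Each image is assumed (for this sketch) to be a list of rows, and each
    row a list of integer grey-scale intensities.
    """
    if not images:
        raise ValueError("at least one image is required")
    height, width = len(images[0]), len(images[0][0])
    return [
        [max(img[y][x] for img in images) for x in range(width)]
        for y in range(height)
    ]


# Three tiny 2x2 "frames" of the empty capture area (made-up values).
frames = [
    [[10, 12], [0, 3]],
    [[11, 9], [2, 3]],
    [[10, 15], [1, 4]],
]
background = background_image(frames)
```

Taking the maximum rather than the mean means that a background pixel that flickers brightly in even one acquired image raises the background at that pixel.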

To accommodate fluorescent lights, which fluctuate dramatically in intensity from frame to frame at fast shutter speeds, one of the images is acquired using a slower shutter speed. In one embodiment, a shutter speed of 8,333 microseconds is used since fluorescent lights cycle at 120 Hz. For this image, only the pixels that are at the maximum intensity are included. The remaining pixels are set to zero.

Once the background image is determined, pixels that are at the maximum intensity can be expanded so that one or more pixels surrounding the maximum intensity pixels are set to the maximum intensity to accommodate small fluctuations in the light intensity or small movements of the background objects. The amount of expansion is selectable via a configuration file. In one embodiment, the amount of expansion is one pixel.
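The expansion of maximum intensity pixels can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not specify the shape of the expanded neighbourhood, so a square neighbourhood is assumed here, and the names expand_saturated, max_intensity and amount are hypothetical.

```python
def expand_saturated(image, max_intensity=255, amount=1):
    """Set every pixel within `amount` pixels of a saturated pixel to saturated.

    Assumption of this sketch: the neighbourhood is a square of side
    2 * amount + 1 centred on each saturated pixel.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input is left unchanged
    for y in range(height):
        for x in range(width):
            # Test the ORIGINAL image so the expansion does not cascade.
            if image[y][x] != max_intensity:
                continue
            for ny in range(max(0, y - amount), min(height, y + amount + 1)):
                for nx in range(max(0, x - amount), min(width, x + amount + 1)):
                    out[ny][nx] = max_intensity
    return out


# A 5x5 background with a single saturated pixel in the middle.
bg = [[0] * 5 for _ in range(5)]
bg[2][2] = 255
expanded = expand_saturated(bg, max_intensity=255, amount=1)
```

With an expansion of one pixel, the single saturated pixel grows into a 3x3 saturated block and the rest of the image is unchanged.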

The background image can be used to mask certain portions of the image, either totally or partially. To totally mask a portion of the image, the pixels corresponding to that portion of the image are set to the maximum intensity. To partially mask a portion of the image, the pixels corresponding to that portion of the image are set to an intensity that exceeds the expected brightness. Partially masking a portion of the image allows a marker to be detected if the marker is located within the partially masked portion of the image, since the intensity of the marker should exceed the intensity of the background image. Masking certain portions of the image, either totally or partially, helps speed the processing of the images.

Threshold Image

The background image is used to determine a threshold image. A threshold image includes pixel intensities and determines which portions of a motion capture image are used for further processing and which portions are discarded. For each pixel of the captured image, if the pixel intensity of the captured image is above the pixel intensity of the threshold image, then the pixel is used. If the pixel intensity of the captured image is not above the pixel intensity of the threshold image, then the pixel is discarded.

For a pixel having coordinates (x, y), an exemplary threshold image is based on certain threshold parameters (threshold, thresh_xc, thresh_yc, thresh_curve, thresh_bg) and is defined as shown below.

$T(x,y)=\min\left(\text{maximum intensity},\ \dfrac{\text{threshold}}{1+\dfrac{(x-\text{thresh\_xc})^{2}+(y-\text{thresh\_yc})^{2}}{\text{thresh\_curve}^{2}}}+bg(x,y)\cdot\dfrac{\text{thresh\_bg}}{100}\right)$

where

-   threshold is the grey intensity that is added to the background image.
-   thresh_xc is the center x coordinate.
-   thresh_yc is the center y coordinate.
-   thresh_curve is the amount of curvature of the threshold image. In one embodiment, a setting of 400 indicates that pixels that are 400 pixels away from the center have an intensity that is half the intensity of the center threshold value.
-   bg(x,y) is the intensity of the pixel in the background image at coordinates (x,y).
-   thresh_bg is a percentage multiplier for the background image. For example, a value of 120 indicates that the threshold image intensity is 1.2 times the background image intensity.

Any pixel in the threshold image that is set to the maximum intensity is masked and will not be used for motion capture. Masking is typically used to mask stationary hot objects in the motion capture area, such as lights or reflective surfaces. The maximum intensity value depends upon the number of grey-scale bits used. For example, an 8-bit grey-scale provides values between 0 and 255, while a 10-bit grey-scale provides values between 0 and 1023.
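The threshold image definition above can be evaluated per pixel as in the following sketch. The function names threshold_pixel and threshold_image, the list-of-rows image layout, and the default of an 8-bit maximum intensity are assumptions made for illustration.

```python
def threshold_pixel(x, y, bg_value, threshold, thresh_xc, thresh_yc,
                    thresh_curve, thresh_bg, max_intensity=255):
    """Evaluate T(x, y) for one pixel whose background intensity is bg_value."""
    r_squared = (x - thresh_xc) ** 2 + (y - thresh_yc) ** 2
    # The grey intensity falls off with distance from the center; at a
    # distance equal to thresh_curve it is half of its value at the center.
    falloff = threshold / (1 + r_squared / thresh_curve ** 2)
    # thresh_bg is a percentage multiplier applied to the background.
    return min(max_intensity, falloff + bg_value * thresh_bg / 100)


def threshold_image(bg, threshold, thresh_xc, thresh_yc, thresh_curve,
                    thresh_bg, max_intensity=255):
    """Build the full threshold image from a background image (list of rows)."""
    return [
        [threshold_pixel(x, y, bg[y][x], threshold, thresh_xc, thresh_yc,
                         thresh_curve, thresh_bg, max_intensity)
         for x in range(len(bg[0]))]
        for y in range(len(bg))
    ]


# With no background, a curvature of 400 halves the threshold 400 pixels out.
at_center = threshold_pixel(0, 0, 0, 40, 0, 0, 400, 120)
at_400 = threshold_pixel(400, 0, 0, 40, 0, 0, 400, 120)
```

A bright background pixel drives the result to the maximum intensity, which is how stationary hot objects end up masked.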

A threshold image is independently created for each camera. Although a user can select the values of the five parameters used to calculate the threshold image, an auto thresholding feature is available and is described below.

Auto Thresholding

The objective of the auto thresholding feature is to generate values for the threshold parameters that yield a threshold image that is as low as possible without being so low that many spurious globs are generated. Stationary noise is eliminated by the thresh_bg parameter, which multiplies the background image by a certain factor to accommodate fluctuations in the scene and the CCD image noise. During auto thresholding the subject is in the motion capture area and is preferably moving in a manner similar to the motion to be captured.

A number of auto thresholding parameters are initially set by the user and are based on characteristics of the subject and/or the expected motion. In one embodiment, the parameter values are selected through trial and error to lower the number of iterations needed during auto thresholding. Shown below are exemplary auto thresholding parameters.

-   at_min_globs: minimum number of globs that need to be visible when the threshold is being lowered.
-   at_max_globs: number of globs any camera should see. Used to determine whether the threshold is set too low.
-   at_max_large_globs: number of globs over size at_max_area permitted in the field of view.
-   at_num_frames: number of frames the auto thresholding routine can consume. The process will automatically terminate if this many frames have been used by the auto thresholding routine. Default is 1000, but can be set as low as 200. A larger number of frames allows for longer inspection of the scene, which is useful if the subjects are moving in the field of view.
-   at_radius: maximum distance for a glob to be considered as a center glob. Any globs further from the center are considered peripheral globs.
-   at_max_far_globs: maximum number of peripheral globs allowed in the field of view. This affects how the optimal curvature is determined.
-   at_max_area: maximum area or size of a glob that is allowed. No glob should be larger than this. If so, those globs are counted in the at_max_large_globs calculation.
-   at_threshold_step: percentage by which the threshold should be incremented or decremented.
-   at_curve_step: percentage by which the curvature should be incremented or decremented.
-   at_noise: maximum allowed noise in the field of view across several acquired images. Noise is defined as globs that flicker between subsequent images. A few flickering globs are allowed, but if there are too many, then the system will have to eliminate those noise markers based on triangulation errors it has to compute for each glob combination.
-   at_noise_iter: number of good iterations that have to occur during the noise elimination steps (steps 400 and 600 of FIG. 1). A good iteration is one in which the threshold parameters do not have to be modified (i.e. no noise was noticed). Defaults to 8.
-   at_cycles: number of frames (image taking cycles) to be taken within one iteration. Defaults to 6.
-   at_reset_cycles: if this parameter is set, the good iterations counter is reset if at any point noise is detected. This effectively means that with this parameter set there need to be at_noise_iter consecutive good iterations before an optimization step is terminated.
-   at_hang_up: specifies if the connection to the auto thresholding service should be terminated after the thresholding has finished. If this is not the case, it will immediately jump into a glob finding service without further intervention. Default is on.
-   at_modify_curve: allows for different ways in which the starting value for the curvature is found. With this parameter set to 0, auto thresholding will try to find a lowest setting for the curvature; with it set to 1 it will start with the curvature set to at_min_curve (configuration setting); with it set to 2 it will update the curvature setting to the lowest setting and in the subsequent stage only look for peripheral noise (ignoring the center noise), thus not incrementing the threshold value either; with it set to 3 the effect is the same as if set to 2, except that all noise detected in the field of view will increment the curvature setting. Default setting is 1.
-   at_min_curve: minimum curvature used as the lower starting point for thresh_curve if at_modify_curve is set to 1 or higher. Default is 200.
-   at_keep_running: when set, auto thresholding will complete all allowed frames. If not, auto thresholding will finish as soon as enough good cycles have been completed. Default is on.

The auto thresholding parameters are used to calculate a threshold image. FIG. 1 illustrates the steps performed during auto thresholding. In step 200, the center of the subject's image for curvature falloff is determined. Once the center of the image is determined, the threshold intensity is lowered until the noise is too high in step 300. Once the noise is too high, the threshold intensity is incremented until the noise is limited in step 400. Once the noise is limited, the curvature is lowered until the peripheral noise is too high in step 500. Once it is determined that the peripheral noise is too high, then the threshold and curvature are incremented until the noise is limited across the field of view in step 600. Details of each of these steps are provided in connection with FIGS. 2-6.

FIG. 2 provides additional details of finding the center of the image for curvature falloff. In step 202, the initial threshold parameters are set to establish an initial threshold image. The initial threshold parameters can be stored on the camera or updated from the central computer. The parameter values are determined empirically for the particular motion capture environment. In step 204, an image is captured. Typically the image is captured without threshold curvature and using the current settings for the threshold parameters. In step 206, the number and size of globs in the image are determined. A glob includes a number of connected hot segments and a hot segment includes a number of connected hot pixels. The method for determining or recognizing globs is described in the section entitled “Glob Detection” below. In step 208, a determination is made as to whether the number of globs is greater than two times the minimum number of expected globs, as specified by the at_min_globs parameter. If so, then the yes branch is followed to step 210 and the centers of all of the globs are averaged to determine the center of the image. Once the center of the image is determined, then the threshold parameters thresh_xc and thresh_yc are updated in step 212.

If the number of globs is not greater than two times the minimum number of expected globs in step 208, then the method proceeds to step 214. In step 214, the threshold intensity values are halved. The method then returns to step 204 and captures another image.
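The loop of FIG. 2 can be sketched as follows, purely as an illustration. The capture step is replaced by a caller-supplied function, capture_globs, which is a hypothetical stand-in that returns the (x, y) centers of the globs seen at a given threshold intensity; the iteration cap max_iterations is likewise an assumption added so the sketch always terminates.

```python
def find_center(capture_globs, threshold, at_min_globs, max_iterations=16):
    """Halve the threshold until enough globs are seen, then average centers.

    Returns (thresh_xc, thresh_yc, threshold_used).
    """
    for _ in range(max_iterations):
        globs = capture_globs(threshold)            # steps 204 and 206
        if len(globs) > 2 * at_min_globs:           # step 208
            xc = sum(g[0] for g in globs) / len(globs)   # step 210
            yc = sum(g[1] for g in globs) / len(globs)
            return xc, yc, threshold                # step 212
        threshold /= 2                              # step 214
    raise RuntimeError("not enough globs found before the iteration cap")


# Simulated scene: (x, y, brightness) for five bright spots (made-up data).
scene = [(100, 100, 200), (200, 100, 90), (100, 200, 60),
         (200, 200, 50), (150, 150, 30)]


def fake_capture(threshold):
    """Stand-in for a camera: a spot is a glob if it is above the threshold."""
    return [(x, y) for x, y, brightness in scene if brightness > threshold]


center = find_center(fake_capture, threshold=160, at_min_globs=2)
```

In this simulated scene the threshold is halved three times, from 160 to 20, before more than four globs are visible, and the averaged center is the middle of the scene.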

FIG. 3 provides additional details for the step of lowering the threshold intensity until the noise is too high. In step 302, the threshold parameters are set to the values used in step 202, i.e., the values are set to the values that were set prior to the start of auto thresholding. In step 304, an image is captured without threshold curvature using the current threshold parameters. In step 306 the number and size of the globs in the image are determined. In step 308, a determination is made as to whether the number of globs exceeds the expected number of globs (at_max_globs), the number of large globs exceeds the expected number of large globs (at_max_large_globs), or the threshold is at or below its lower limit. If any of these conditions is satisfied, then the threshold intensity is low enough. The yes branch is followed to step 310 and auto thresholding continues with step 400. If none of the conditions is satisfied, then the no branch is followed to step 312 where the threshold intensity is lowered by a predetermined value specified by at_threshold_step %. The method then returns to step 304 and repeats.
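A minimal sketch of the FIG. 3 loop is shown below. It assumes that lowering by at_threshold_step % means multiplying the current threshold by (1 - at_threshold_step / 100), which is one reading of the text. The count_globs callback and the lower_limit default are hypothetical names introduced for this example.

```python
def lower_threshold(count_globs, threshold, at_max_globs, at_max_large_globs,
                    at_threshold_step, lower_limit=1.0):
    """Lower the threshold until too many globs appear (steps 304 to 312).

    count_globs(threshold) is a stand-in for capturing an image and returns
    (number_of_globs, number_of_large_globs).
    """
    while True:
        n_globs, n_large = count_globs(threshold)       # steps 304 and 306
        if (n_globs > at_max_globs                      # step 308
                or n_large > at_max_large_globs
                or threshold <= lower_limit):
            return threshold                            # step 310
        threshold *= 1 - at_threshold_step / 100        # step 312


def fake_count(threshold):
    """Simulated camera: more spurious globs appear as the threshold drops."""
    return int(1000 / threshold), 0


noisy_threshold = lower_threshold(fake_count, threshold=100.0,
                                  at_max_globs=20, at_max_large_globs=2,
                                  at_threshold_step=10)
```

The loop stops at the first threshold that is too low, which is the starting point for the noise elimination step that follows.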

FIG. 4 illustrates in more detail the step of incrementing the threshold intensity until the noise is limited. In step 404, a number of images are captured. The number of images is equal to the value specified by at_cycles. In step 406, the number and size of the globs in each image are determined. In step 408, the noise is computed for this series of images. In particular, step 408 determines if the number of globs exceeds the expected number of globs (at_max_globs) or if the change in the number of globs exceeds a tolerance specified by at_noise and the change in the number of single pixel globs exceeds the tolerance specified by at_noise; and if the number of globs in any frame exceeds the minimum number of globs (at_min_globs) expected. If so, then there is too much noise and the yes branch is followed to step 410. In step 410, the threshold intensity value is incremented by an amount specified by at_threshold_step % and the method returns to step 404 and repeats.

If the conditions specified in step 408 are not satisfied, then the no branch is followed to step 412 and the number of good cycles is incremented by one. In step 414, the number of good cycles is compared to a predetermined number specified by at_noise_iter. If the number of good cycles is equal to the predetermined number, then the method proceeds to step 416 and continues with step 500. If the number of good cycles does not equal the predetermined number, then the no branch is followed and the method returns to step 404 and repeats.
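The control flow of FIG. 4 can be sketched as below. The compound noise test of step 408 is deliberately left to a caller-supplied predicate, is_noisy, because this sketch does not attempt to interpret how its conditions combine. The function name, the multiplicative reading of at_threshold_step %, and the max_iterations cap are assumptions.

```python
def raise_threshold(is_noisy, threshold, at_threshold_step, at_noise_iter,
                    at_reset_cycles=False, max_iterations=1000):
    """Raise the threshold until at_noise_iter good iterations have occurred.

    is_noisy(threshold) stands in for steps 404 to 408: capture at_cycles
    images, find the globs, and decide whether there is too much noise.
    """
    good = 0
    for _ in range(max_iterations):
        if is_noisy(threshold):
            threshold *= 1 + at_threshold_step / 100    # step 410
            if at_reset_cycles:
                good = 0        # require consecutive good iterations
        else:
            good += 1                                   # step 412
            if good >= at_noise_iter:                   # step 414
                return threshold                        # step 416
    raise RuntimeError("noise was not limited before the iteration cap")


# Simulated scene in which noise disappears once the threshold reaches 50.
quiet_threshold = raise_threshold(lambda t: t < 50, threshold=43.0,
                                  at_threshold_step=10, at_noise_iter=8)
```

Starting from 43 with a 10 % step, the threshold is raised twice (to about 47.3 and then about 52.0) before eight good iterations accumulate.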

FIG. 5 provides additional details of the step of lowering the curvature until the peripheral noise is too high. In step 502, a determination is made as to whether the starting curvature value is zero. If so, then the method proceeds to step 504 and an image is captured. In step 506 the number of peripheral globs in the image is determined. A peripheral glob is one that is further from the center than a distance specified by at_radius. In step 508, a determination is made as to whether the number of peripheral globs exceeds a predetermined threshold specified by at_max_far_globs. If so, then the method proceeds to step 510 and continues to step 600.

If the number of peripheral globs does not exceed the predetermined threshold, then the no branch is followed to step 516 where the curvature is lowered by an amount specified by at_curve_step %. The method then returns to step 504 and repeats.

If the starting curvature value, at_modify_curve, is not set to zero, then the no branch is followed from step 502 to step 512. In step 512, the curvature is set to at_min_curve and the method proceeds to step 514 and continues with step 600.

FIG. 6 provides additional details of the step of incrementing the threshold intensity and curvature until the noise is limited across the field of view. FIG. 6 is similar to FIG. 4, except that both the threshold and the curvature are optimized in FIG. 6. If noise is detected in the center, then the threshold is incremented. If noise is detected in the periphery, then the curvature is incremented.

In step 602, a number of images are captured. The number of images is specified by at_cycles. In step 604, the number and size of the globs are determined and in step 606 the globs are determined to be either center globs or peripheral globs. Step 608 determines whether there is any noise in the center. In particular, step 608 determines if the number of center globs exceeds the expected number of globs (at_max_globs) or if the change in the number of center globs exceeds a noise threshold specified by at_noise and the change in the number of single pixel globs exceeds the noise threshold specified by at_noise; and if the number of center globs in any frame exceeds the minimum number of globs expected (at_min_globs). If so, then there is too much noise and the yes branch is followed to step 610. In step 610, the threshold intensity value is incremented by a predetermined amount specified by at_threshold_step %. The method then returns to step 604 and repeats.

If the conditions in step 608 are not satisfied, indicating that there is no noise in the center, then the no branch is followed to step 612. In step 612, a determination is made as to whether there is noise in the periphery. In particular, step 612 determines if the number of peripheral globs exceeds the expected number of globs (at_max_globs) or if the change in the number of peripheral globs exceeds a noise threshold specified by at_noise and the change in the number of single pixel globs exceeds the noise threshold specified by at_noise; and if the number of peripheral globs in any frame exceeds the minimum number of globs expected (at_min_globs). If so, then there is too much noise and the yes branch is followed to step 614. In step 614, the threshold curvature value is incremented by a predetermined amount specified by at_curve_step.

If the conditions in step 612 are not satisfied, indicating that there is no noise in the periphery, then the method proceeds to step 616. In step 616, a counter indicating the number of good frames is incremented. In step 618, the number of good frames is compared to a threshold specified by at_num_frames. If the number of good frames meets the threshold, then the method proceeds to step 620 and ends. If the number of good frames does not meet the threshold, then the no branch is followed back to step 602 and the method repeats.

Calculating the threshold image is a computationally intensive process since a separate calculation is performed for each pixel. Performing the calculation without using the curvature significantly speeds up the calculation. If the curvature is set to a relatively high value, then the threshold image is calculated without the curvature present. In one embodiment, if thresh_curve is set to a value of 4000, which is the maximum allowed and which has the minimal effect on the threshold image, then the threshold image is calculated without the curvature. In this embodiment the first three steps illustrated in FIG. 1 are performed with the curvature set to a value of 4000, thus performing the calculation without using the curvature. As will be apparent to those skilled in the art, as processing capacity increases, it may be preferable to use the curvature. If so, then the previously described steps would be modified to use the curvature parameter.

Motion Capture

To capture the motion of a subject, the cameras capture a series of images. The images are processed to identify the locations of the markers (or the subject) from frame to frame and can be used to animate a graphical model representing the subject's motion. FIG. 7 illustrates the processing of the captured images. In step 700, each camera captures an image. Each captured image is compared to a threshold image to detect globs, in step 702. Once the globs are detected, the globs are evaluated to determine whether the characteristics of the globs satisfy certain predetermined characteristics, in step 704. If so, then the 3D coordinates for the candidate points corresponding to the globs are calculated using information from all of the cameras in step 706. In step 708, the relative locations of the candidate points are evaluated across frames to identify a track and track attributes in order to identify a candidate for further analysis. In one embodiment, steps 700-704 are performed on the cameras and steps 706-708 are performed on the central computer. However, as the cameras become more powerful, more of the processing will be performed on the cameras. Additional details of the steps are provided below.

Glob Detection

Each image captured by the camera is compared to the threshold image on a pixel-by-pixel basis. The intensity of each pixel of the captured image is compared to the intensity of the corresponding pixel of the threshold image. If the intensity of the pixel of the captured image is greater than the intensity of the pixel of the threshold image, then the pixel is marked as a hot pixel. Once all of the pixels are compared, the information is used to generate an RLE (run length encoding). The RLE is a method of describing the locations of all the hot pixels in the captured image. The RLE is a collection of segments, where a segment is defined as a single hot pixel or a series of connected hot pixels on a line. The RLE is stored in such a manner that a line number and the beginning and ending pixels of a segment on the line are encoded together with an index for each segment.

Each line that includes any hot pixels is encoded using a number of shorts (two bytes each). The first short corresponds to the line number and the second short corresponds to the number of hot segments in the line. For each hot segment, three additional shorts are used to identify the segment. The first short is the first hot pixel in the segment, the second short is the last hot pixel in the segment and the third short is the segment index. Shown below is an example.

Threshold Image   01 02 04 06 06 04 05 06 06 02 50 80 80
Captured Image    00 01 04 16 20 14 06 04 01 00 60 65 68
Hot/Cold           C  C  C  H  H  H  H  C  C  C  H  C  C

The first line of the example shown above represents the pixel intensity of the threshold image and the second line represents the pixel intensity of the captured image. The third line indicates whether the intensity of the pixel of the captured image is greater than the intensity of the corresponding pixel of the threshold image, i.e. the pixel is hot. Assuming that the above lines correspond to line 50, then the information is encoded as follows.

0050 0002 0003 0006 xxxx 0010 0010 xxxx

The first short represents the line number (0050) and the second short represents the number of hot segments (0002). The third short represents the first hot pixel of the first hot segment (0003), the fourth short represents the last hot pixel of the first hot segment (0006), and the fifth short represents the segment index. The first hot segment is four pixels long and begins at pixel 3 and ends at pixel 6. The sixth short represents the first hot pixel of the second hot segment (0010), the seventh short represents the last hot pixel of the second hot segment (0010), and the eighth short represents the segment index. The second hot segment is one pixel long and begins and ends at pixel 10. Since the segment indexes are not yet defined, they are designated as xxxx.
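The encoding of the example line can be reproduced with the short sketch below. Pixels are numbered from zero, which matches the pixel numbers used in the example. The function name encode_line is hypothetical, and None is used here as a placeholder for the not yet assigned segment index (shown as xxxx above); a real implementation would pack two-byte values.

```python
def encode_line(line_number, captured, threshold):
    """Encode the hot segments of one image line as a flat list of values.

    Layout: line number, segment count, then (first, last, index) per segment.
    Returns an empty list for a line with no hot pixels.
    """
    if len(captured) != len(threshold):
        raise ValueError("captured and threshold lines must be the same length")
    segments = []
    start = None
    for x, (c, t) in enumerate(zip(captured, threshold)):
        if c > t:                       # hot pixel: strictly above threshold
            if start is None:
                start = x               # a new segment begins here
        elif start is not None:
            segments.append((start, x - 1))
            start = None
    if start is not None:               # segment running to the end of line
        segments.append((start, len(captured) - 1))
    if not segments:
        return []
    encoded = [line_number, len(segments)]
    for first, last in segments:
        encoded += [first, last, None]  # None: segment index not yet assigned
    return encoded


threshold_line = [1, 2, 4, 6, 6, 4, 5, 6, 6, 2, 50, 80, 80]
captured_line = [0, 1, 4, 16, 20, 14, 6, 4, 1, 0, 60, 65, 68]
encoded = encode_line(50, captured_line, threshold_line)
```

Note that pixel 2 (captured 04, threshold 04) is cold because a hot pixel must be strictly greater than the threshold.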

The segment indexes indicate which hot segments are connected. For example, if a hot segment on line 31 begins at pixel 101 and ends at pixel 105 and a hot segment on line 32 includes any pixel from 101 to 105 inclusive, then the two hot segments are connected and are assigned the same index number. Connected hot segments are referred to herein as globs. Each glob is identified by a single index number that is unique for the frame.

In some circumstances, a single glob may be initially identified as two or more globs. Consider for example a “U” shaped glob. Initially the two legs of the U receive different index numbers. However, when the bottom of the U is processed, it is discovered that the two legs are connected. In this situation, the index numbers are modified so that the U-shaped glob is identified by a single index number.
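One way to assign segment indexes and merge them when two provisional globs turn out to be connected is a union-find structure. The sketch below is an assumption about how this could be done, since the text does not name a particular algorithm; the function name `label_segments` and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # (first hot pixel, last hot pixel) on one line


def label_segments(lines: Dict[int, List[Segment]]) -> Dict[Tuple[int, int, int], int]:
    """Map each (line, first, last) segment to a glob index.

    Segments on adjacent lines that share at least one pixel column are
    treated as connected and end up with the same index.
    """
    parent: List[int] = []  # union-find forest over provisional indexes

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # keep the lower index

    labels: Dict[Tuple[int, int, int], int] = {}
    for line in sorted(lines):
        for first, last in lines[line]:
            key = (line, first, last)
            for pf, pl in lines.get(line - 1, []):
                if first <= pl and pf <= last:  # column ranges overlap
                    prev = labels[(line - 1, pf, pl)]
                    if key in labels:
                        union(labels[key], prev)  # e.g. bottom of a "U"
                    else:
                        labels[key] = prev
            if key not in labels:  # not connected to anything above
                parent.append(len(parent))
                labels[key] = len(parent) - 1
    return {k: find(v) for k, v in labels.items()}


# A "U" shape: two legs on lines 10 and 11, joined by the bottom on line 12.
u_shape = {10: [(1, 2), (8, 9)], 11: [(1, 2), (8, 9)], 12: [(1, 9)]}
u_labels = label_segments(u_shape)
```

In the U-shaped example the two legs first receive different provisional indexes, and the final pass through `find` reports one index for every segment.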

Glob Discrimination

Once the globs are identified, the globs are compared to the characteristics of the markers (if markers are used) or the subject (if markers are not used). For each glob, the number of hot pixels, a bounding box, a fill factor and the center of gravity are calculated. The bounding box is a regularly shaped area, such as a square, that contains the glob and is used to compare the shape of the glob to the shape of the marker or subject. The fill factor is computed by dividing the area of the glob by the area of the bounding box. In one embodiment, the area of the glob is determined by assuming that the glob is roughly circular in shape and calculating the area of a circle.

The center of gravity can be calculated based on whether the pixels are hot or cold or can be based on the grey-scale levels of the pixels. The center of gravity calculation can consider pixels that are below the threshold, but border a hot pixel. Consider a glob consisting of a single hot pixel located at (100, 100) with bordering intensities as shown below. The threshold intensity for the corresponding threshold image is 50.

$\begin{matrix}\; & 099 & 100 & 101 \\099 & 4 & 4 & 4 \\100 & 5 & 60 & 45 \\101 & 4 & 10 & 4\end{matrix}$

If only the hot pixel is considered, then the center of gravity is calculated as (100, 100). However, if the bordering pixels are considered, then the center of gravity is calculated as (100.286, 100.043).
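The grey-scale center of gravity in this example can be reproduced with an intensity-weighted average. The sketch below assumes that the column labels of the matrix are x coordinates and the row labels are y coordinates, which is the reading that yields the stated result; the function name `center_of_gravity` is hypothetical.

```python
from typing import Sequence, Tuple


def center_of_gravity(intensities: Sequence[Sequence[float]],
                      xs: Sequence[float],
                      ys: Sequence[float]) -> Tuple[float, float]:
    """Intensity-weighted centroid of a small pixel neighbourhood.

    intensities[r][c] is the grey level at column xs[c], row ys[r].
    """
    total = sum(sum(row) for row in intensities)
    cx = sum(v * x for row in intensities for v, x in zip(row, xs)) / total
    cy = sum(v * y for row, y in zip(intensities, ys) for v in row) / total
    return cx, cy


neighbourhood = [
    [4, 4, 4],    # row 099
    [5, 60, 45],  # row 100
    [4, 10, 4],   # row 101
]
cx, cy = center_of_gravity(neighbourhood, [99, 100, 101], [99, 100, 101])
```

The bright neighbour at column 101 (intensity 45) pulls the x coordinate toward 100.286, while the slightly brighter bottom row shifts the y coordinate to about 100.043.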

The characteristics of the globs are compared to the expected characteristics of the markers or subjects. For example, the size of a glob (number of hot pixels) is compared to the expected size of a marker. If the glob is too small or too big, then it is discarded from further processing. In addition, the shape of the glob (the bounding box of the glob) is compared to the expected shape or proportions of a marker. In one embodiment, if the bounding box is elongated more than a predetermined amount (e.g. width is more than three times height), then the glob is discarded, since the markers are round spheres. In this embodiment, an oblong or elongated bounding box likely results from reflections from shiny surfaces, such as door or window frames.

The fill factor of the glob is also compared to an expected fill factor. In one embodiment, a fill factor of between 40% and 60% is used. The fill factor is used to eliminate globs that are hollow or diagonally elongated. The criteria for size, shape and fill factor are based on the known characteristics of the markers or subject and thus, will differ based on the markers or subject to be captured. Additional criteria may also be used depending upon the characteristics of the marker or subject. If the characteristics of the glob meet the expected characteristics, then the glob is identified for further processing.
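The size, shape and fill factor tests can be sketched as a simple filter. Several choices here are assumptions made for illustration: the glob area is taken to be the hot pixel count (rather than the circle-based estimate mentioned above), the elongation test is applied in both directions, and the names `GlobStats`, `glob_stats` and `accept` are hypothetical. The size limits are passed in because they depend on the marker or subject.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class GlobStats:
    hot_pixels: int
    width: int          # bounding box width in pixels
    height: int         # bounding box height in pixels
    fill_factor: float  # glob area divided by bounding box area


def glob_stats(segments: Iterable[Tuple[int, int, int]]) -> GlobStats:
    """Compute statistics from the (line, first, last) segments of one glob."""
    segs = list(segments)
    hot = sum(last - first + 1 for _, first, last in segs)
    x0 = min(first for _, first, _ in segs)
    x1 = max(last for _, _, last in segs)
    y0 = min(line for line, _, _ in segs)
    y1 = max(line for line, _, _ in segs)
    w, h = x1 - x0 + 1, y1 - y0 + 1
    return GlobStats(hot, w, h, hot / (w * h))


def accept(stats: GlobStats, min_pixels: int, max_pixels: int,
           max_elongation: float = 3.0,
           fill_range: Tuple[float, float] = (0.4, 0.6)) -> bool:
    """Return True if the glob passes the size, shape and fill factor tests."""
    if not (min_pixels <= stats.hot_pixels <= max_pixels):
        return False  # too small or too big
    if max(stats.width, stats.height) > max_elongation * min(stats.width, stats.height):
        return False  # elongated, e.g. a reflection from a door frame
    return fill_range[0] <= stats.fill_factor <= fill_range[1]


plus_shape = glob_stats([(0, 1, 1), (1, 0, 2), (2, 1, 1)])
streak = glob_stats([(5, 0, 9)])
```

Here `plus_shape` has five hot pixels in a 3 by 3 bounding box (fill factor of about 0.56), while `streak` is a one-line reflection ten pixels wide that fails the elongation test.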

3D Location

Glob detection is performed on a frame-by-frame basis for each image generated by each camera. A set of globs, G_(c), is generated for each frame from each camera. To determine the 3D coordinates for the candidate points corresponding to the globs, a set of 3D rays R_(c) is constructed from each set of globs. The form of each image ray R is:

R = P_(R) + d*D_(R)

where
P_(R) is the origin of the ray (the camera position),
D_(R) is the normalized direction of the ray, and
d is a distance (to be determined via triangulation) that the point is along the ray.

Triangulation of the rays R_(c) across all cameras, with a specified error tolerance (typically 0.8), gives a set of 3D points, M_(t). The points, M_(t), represent candidate points and areas of interest in the scene. These points are further evaluated based on their relative movement from frame to frame.
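For two cameras, one common way to triangulate is to find the closest approach of the two rays and accept the midpoint when the gap between the rays is within the tolerance. The sketch below illustrates that approach under stated assumptions: the text does not specify the triangulation method or the units of the tolerance, the function name `triangulate` is hypothetical, and only a single pair of rays is handled rather than rays from all cameras.

```python
import math
from typing import List, Optional, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def triangulate(p1: Sequence[float], d1: Sequence[float],
                p2: Sequence[float], d2: Sequence[float],
                tolerance: float = 0.8) -> Optional[List[float]]:
    """Midpoint of closest approach of rays p1 + s*d1 and p2 + t*d2.

    Returns None if the rays are parallel, if the closest approach lies
    behind either camera, or if the rays miss each other by more than
    the tolerance.
    """
    w = [a - b for a, b in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None  # parallel rays have no unique closest point
    s = (b * e - c * d) / denom  # distance along the first ray
    t = (a * e - b * d) / denom  # distance along the second ray
    if s < 0 or t < 0:
        return None  # the point would be behind a camera
    q1 = [p + s * v for p, v in zip(p1, d1)]
    q2 = [p + t * v for p, v in zip(p2, d2)]
    if math.dist(q1, q2) > tolerance:
        return None
    return [(x + y) / 2 for x, y in zip(q1, q2)]


point = triangulate((0, 0, 0), (1, 0, 0), (5, -5, 0.4), (0, 1, 0))
```

In the usage line the two rays pass within 0.4 of each other near (5, 0, 0), so the midpoint is accepted as a candidate point; a gap larger than the tolerance would return `None`.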

Candidate Identification

Over time frames t1, t2, . . . , tn, a sequence of marker sets, M_(t1), M_(t2), . . . , M_(tn), is generated. The relative locations of the marker sets from frame to frame are evaluated to identify tracks and track attributes. For example, a point in one marker set is compared to a point in a subsequent marker set to determine the “closeness” of the positions. If the points are close, then the points may be deemed to be part of the same track and a track number is assigned to the points. The criteria used to evaluate closeness, such as the relative positions from frame to frame, the number of consecutive frames of closeness, the number of frames without closeness, etc., are based on the object or subject that is being captured. Once a track, Ti, is identified, the track is assigned a type, such as car or person, based on the values of the various attributes of the track. The tracks that are assigned a type that corresponds to the object or subject of interest are identified as candidates.
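The closeness test between consecutive marker sets can be illustrated with a greedy nearest-neighbour association. This is only a sketch under assumptions: the text does not prescribe a matching strategy, the function name `extend_tracks` and the `max_dist` parameter are hypothetical, and attributes such as the number of consecutive frames with or without closeness are not modelled.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def extend_tracks(tracks: List[List[Point]],
                  points: Sequence[Point],
                  max_dist: float) -> List[List[Point]]:
    """Append each new point to the track whose last point is close to it.

    Points that are not close to any existing track start a new track.
    Tracks without a close point in this frame are left unchanged.
    """
    unused = list(points)
    for track in tracks:
        if not unused:
            break
        nearest = min(unused, key=lambda p: math.dist(track[-1], p))
        if math.dist(track[-1], nearest) <= max_dist:
            track.append(nearest)
            unused.remove(nearest)
    tracks.extend([p] for p in unused)
    return tracks


frames = [
    [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (10.2, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (50.0, 0.0, 0.0)],
]
tracks: List[List[Point]] = []
for marker_set in frames:
    extend_tracks(tracks, marker_set, max_dist=1.0)
```

After the three frames the first track contains three points, the second stops after two because nothing close to it appears in the last frame, and the distant point at x = 50 starts a third track.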

Markerless Tracking

Markers are not suitable for certain objects, such as balls, since they may interfere with the subject's interaction with the object. For example, placing markers on a baseball is likely to interfere with the pitching motion of a pitcher. Therefore, the system can capture the motion of an unmarked object by locating globs with characteristics that correspond to the known characteristics of the object. If the captured motion includes a subject fitted with markers and an unmarked object, such as a ball, then there are separate glob detection parameters defined for the markers and the unmarked object.

In one embodiment, the parameters for an unmarked object, such as a ball, include a region of interest, a minimum area, a maximum area, an aspect ratio, a maximum intensity, a threshold and a shutter speed. The region of interest parameters are defined to include the expected location of the object. Typically the region of interest is a regularly shaped region, such as a square. The minimum area and the maximum area specify the expected size of the object and the aspect ratio specifies the expected proportions of the object. The threshold parameter defines an intensity value for the region of interest within the threshold image. The maximum intensity specifies a maximum intensity for the glob. Since an unmarked object is not as bright as a marker, a lower threshold intensity is used for the threshold image and the maximum intensity for the glob is lower than the expected intensity for a marker.

In an embodiment that only uses markerless tracking, the shutter speed is longer than the shutter speeds used for marker-based tracking. In an embodiment that combines marker-based tracking and markerless tracking, the shutter speed is the shutter speed used for marker-based tracking. The unmarked object is tracked by comparing the captured image to the previously captured image with an offset. The previously captured image plus the offset acts as the threshold image for the unmarked object. The offset is typically on the order of six to twelve grey scale values due to the fast shutter speeds and overall dark images. Thus, two RLEs are generated for combined marker-based and markerless tracking, one for tracking the markers and a second for tracking the unmarked objects.
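Using the previously captured image plus an offset as the threshold image can be sketched as follows. The function name `hot_mask` is hypothetical, images are assumed to be lists of rows of grey-scale values, and the default offset of 8 is simply one value within the six to twelve range mentioned above.

```python
from typing import List, Sequence


def hot_mask(current: Sequence[Sequence[int]],
             previous: Sequence[Sequence[int]],
             offset: int = 8) -> List[List[bool]]:
    """Mark pixels whose intensity exceeds the previous frame plus an offset.

    The previous frame plus the offset plays the role of the threshold
    image for the unmarked object.
    """
    return [[c > p + offset for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]


previous_frame = [[10, 10, 10],
                  [10, 10, 10]]
current_frame = [[10, 30, 12],
                 [19, 18, 10]]
mask = hot_mask(current_frame, previous_frame)
```

Only pixels that brighten by more than the offset between frames are marked hot, so a static dark background produces no hot pixels while a moving object does. The resulting mask could then be run length encoded separately from the marker RLE.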

The number and type of parameters are based on the known characteristics of the unmarked object and thus, will differ based on the object to be captured. Additional methods for tracking an object without the use of markers are possible and include the head tracking method described in U.S. patent application Ser. No. TBD entitled “System and Method for Motion Capture,” which is incorporated herein by reference.

Additional alternative embodiments will be apparent to those skilled in the art to which the present invention pertains without departing from its spirit and scope. In particular, the parameters described herein are exemplary and additional or alternative parameters can be used. The invention can be used with multiple sets of parameters for simultaneously tracking different types of objects. The objects can include a combination of objects with markers and objects without markers. Markerless tracking is not limited to balls or a person's head, but can be applied to any type of subject. The image processing described herein can be performed on the central computer or on the cameras depending upon the particular hardware configuration used. Accordingly, the scope of the present invention is described by the appended claims and is supported by the foregoing description.

1. A method for capturing motion, comprising: receiving a captured image having a plurality of pixels arranged in rows and columns; for each pixel of the captured image, comparing an intensity of the pixel of the captured image to an intensity of a corresponding pixel in a threshold image; if the intensity of the pixel of the captured image exceeds the intensity of the corresponding pixel in the threshold image, then designating the pixel of the captured image as a hot pixel; comparing hot pixels in adjacent rows to determine related hot pixels; designating the related hot pixels as a glob; comparing characteristics of the glob with predetermined characteristics of an item to track the motion associated with the item.

2. The method of claim 1, further comprising: if the characteristics of the glob satisfy the predetermined characteristics of the item, then determining three-dimensional coordinates for candidate points corresponding to the glob; determining track attributes for the candidate points, wherein the track attributes describe the candidate point's movement from frame to frame; comparing the track attributes for the candidate points to expected attributes of the motion of the item; and if the track attributes satisfy the expected attributes, then determining that the candidate points correspond to the item.

3. The method of claim 1 wherein the item is a reflective marker.

4. The method of claim 1, wherein the item is an object.

5. A method for capturing a background image, comprising: for each of a plurality of cameras, capturing a plurality of images over a period of time using a predetermined shutter speed, wherein each image has a plurality of pixels; for each pixel of a selected captured image, comparing an intensity of the pixel with an intensity of a corresponding pixel of other ones of the captured images; determining a maximum intensity of the pixels; and using the maximum intensity of the pixels as the intensity for the corresponding pixel of the background image.

6. The method of claim 5, further comprising: capturing a second plurality of images over a second period of time using a second predetermined shutter speed; for each pixel of the second captured images: comparing an intensity of the pixel with a predetermined intensity threshold; if the pixel satisfies the predetermined intensity threshold, then setting the corresponding pixel in the background image to the predetermined intensity threshold.

7. A method for capturing motion associated with a first item, wherein the first item is associated with a marker and a second item, wherein the second item is not associated with a marker, comprising: receiving a captured image having a plurality of pixels arranged in rows and columns; comparing the captured image to a threshold image on a pixel-by-pixel basis; based on the comparison, identifying globs within the captured image; comparing the globs to a first set of criteria, wherein the first set of criteria correspond to characteristics of the marker; comparing the globs to a second set of criteria, wherein the second set of criteria correspond to characteristics of the second item; based on the comparison to the first set of criteria and the second set of criteria, identifying any globs that correspond to the marker and identifying any globs that correspond to the second item to track the motion of the marker and the second item.

8. The method of claim 7, wherein a region of interest defines an expected range of locations for the second item, and wherein comparing the captured image to a threshold image on a pixel-by-pixel basis, comprises: for each pixel of the captured image, comparing an intensity of the pixel of the captured image to an intensity of a corresponding pixel in the threshold image; and for each pixel of the captured image that corresponds to the region of interest, comparing the intensity of the pixel of the captured image to a predetermined intensity associated with the second item.

9. A method for creating a threshold image, comprising: determining a center of an image for curvature falloff; lowering a threshold intensity until detected noise exceeds a predetermined threshold; raising the threshold intensity until the detected noise satisfies a second predetermined threshold; lowering a threshold curvature until detected peripheral noise exceeds a third predetermined threshold; and raising the threshold intensity and the threshold curvature until the detected noise satisfies a third predetermined threshold and the detected peripheral noise satisfies a fourth predetermined threshold, wherein determining the detected noise and the peripheral noise includes determining a number of globs in the image and comparing the globs to certain thresholding parameters.

10. The method of claim 9, wherein the third predetermined threshold corresponds to noise in the center of the image and the fourth predetermined threshold corresponds to peripheral noise in the image.

11. The method of claim 10, wherein raising the threshold intensity and the threshold curvature comprises: raising the threshold intensity to limit noise in the center of the image; and raising the curvature to address noise in the periphery of the image.